Impact of an AI app‐based exercise program for people with low back pain compared to standard care: A longitudinal cohort‐study

Abstract Background Low back pain (LBP) is a common health problem worldwide. In recent years, the use of mobile applications for the treatment of various diseases has increased due to the COVID-19 pandemic. Objective The aim of this study was to investigate the extent to which artificial intelligence (AI)‐assisted exercise recommendations can reduce pain and pain‐related impairments in daily life for patients with LBP, compared to standard care. Methods To answer the research question, an 8‐week app‐based exercise program was conducted in the intervention group. To measure the influence of the exercise program, pain development and pain‐related impairment in daily life were evaluated. A so‐called rehabilitation sports group served as the control group. The main factors for statistical analysis were time and group. A mixed analysis of variance was conducted for pain development, with a separate check for confounders. For pain‐related impairment in daily life, nonparametric tests were conducted on the mean change between time points. Results The intervention group showed a reduction in pain of 1.4 points on the numeric rating scale, compared to an increase of 0.1 points in the control group. There was a significant interaction of time and group for pain development. Regarding pain‐related impairments in daily life, the intervention group showed a reduction in Oswestry Disability Index scores of 3.8 points, compared to an increase of 2.3 points in the control group. The biggest differences became apparent 8 weeks after the start of treatment. The significant results show medium to strong effect sizes. Conclusion The results shown here suggest that the use of digital AI‐based exercise recommendations in patients with LBP leads to pain reduction and a reduction in pain‐related impairments in daily living compared to traditional group exercise therapy.


| METHOD
The study was conducted in accordance with the ethical principles of the Declaration of Helsinki and in consideration of the STROBE guideline. 18 The complete STROBE checklist can be found in the appendix. A positive ethical vote was given by the ethics committee of the Osnabrück University of Applied Sciences.

| Trial design
A longitudinal cohort study was conducted to answer the previously stated research question. For this purpose, two groups were formed: the intervention group (AI-recommended app exercise program) and the control group (usual care as group sport, so-called rehabilitation sport in Germany). Randomization and blinding of the subjects were not possible, since the control group was permanently enrolled in the rehabilitation sports group. An attempt was made to address the typical biases of a cohort study (selection bias and information bias) at an early stage. 19

| Participants
The following inclusion criteria were applied: (1) Men and women with back pain, and (2)  In a first step, prospective participants were informed via flyers, posters, social media, and direct approaches in physical therapy practices. In this way, interested persons received initial information about the planned study and could then contact the research group by phone or mail. During this contact, they were informed about the inclusion and exclusion criteria, and open questions could be clarified. Participation in the study was possible only after contacting the research group. At the beginning of the study, the participants signed a consent form; consent could be revoked at any time without giving reasons.

| Interventions
The control group (usual care) received group sport over 12 weeks as part of the so-called rehabilitation sport. This involved a group session one to two times a week for about 45 min. The sessions included strengthening, stretching, and/or mobilization exercises designed to address complaints in the neck and back areas. The intervention group received exercise recommendations via an app with the help of an artificial intelligence (AI) that adapts the recommendations to the user's input. The intervention group was able to perform a maximum of five exercises daily.

| Outcomes
The primary outcome was the pain score on the numeric rating scale (NRS), which rates pain from 0 to 10. A change of 2 points is classified as clinically relevant. 20 The secondary outcome was pain-related impairment in daily living, measured with the Oswestry Disability Index (ODI). 21  Outcomes were reported using an online questionnaire at t0 (baseline), t1 (after 4 weeks), and t2 (after 8 weeks). The questionnaire was preceded by a written declaration of consent regarding the processing of personal data. The initial questionnaire (t0, baseline measurement) included demographic and general questions on age, gender, occupation, daily work routine, and frequency of sporting activities. Red flags and exclusion criteria were identified by means of specific questions, and affected persons were excluded from the study if necessary.
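For orientation, the following minimal sketch shows how an ODI score is commonly derived from the questionnaire: ten sections, each scored 0 to 5, summed and expressed as a percentage of the maximum attainable score for the answered sections. This is the widely published scoring convention and is an assumption here; the text does not state how the study scored the ODI or whether the reported values are percentages. The function name `odi_percent` is hypothetical.

```python
from typing import Optional, Sequence


def odi_percent(item_scores: Sequence[Optional[int]]) -> float:
    """Return the ODI as a percentage (0 = no disability, 100 = maximum).

    item_scores holds the ten section answers, each 0-5, or None when a
    section was left blank. Blank sections are dropped from the denominator.
    Assumption: standard published scoring, not confirmed by the study text.
    """
    if len(item_scores) != 10:
        raise ValueError("the ODI has ten sections")
    answered = [s for s in item_scores if s is not None]
    if not answered:
        raise ValueError("no sections answered")
    if any(s < 0 or s > 5 for s in answered):
        raise ValueError("each section is scored 0-5")
    return 100.0 * sum(answered) / (5 * len(answered))


# Hypothetical answers, not study data
print(odi_percent([1] * 10))  # 20.0
```

Dropping unanswered sections from the denominator keeps scores comparable between participants who skipped a section and those who did not.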

| Statistical analysis
The aim of the statistical analysis was to investigate the effects of an app-based exercise program compared to usual care in people with back pain over several measurement time points.
For the statistical evaluation, all personal data were anonymized so that no inference could be drawn about individual persons. Because of fixed conditions (the sample size was determined by the existing rehabilitation sport groups), no sample size calculation was performed beforehand. Only participants with scores at every time point were included.
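The inclusion rule above (scores at every time point) corresponds to a complete-case filter. The following is a minimal sketch of such a filter, assuming one record per participant with keys `t0`, `t1`, and `t2`; the record layout and the example values are hypothetical and not taken from the study data.

```python
TIME_POINTS = ("t0", "t1", "t2")


def complete_cases(records):
    """Keep only participants with a non-missing score at every time point."""
    return [
        r for r in records
        if all(r.get(tp) is not None for tp in TIME_POINTS)
    ]


# Hypothetical example data, not taken from the study
records = [
    {"id": "A", "t0": 6, "t1": 5, "t2": 4},
    {"id": "B", "t0": 7, "t1": None, "t2": 6},  # missing t1 -> dropped
    {"id": "C", "t0": 5, "t1": 5},              # no t2 entry -> dropped
]
kept = complete_cases(records)
print([r["id"] for r in kept])  # ['A']
```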
For the baseline analysis, the groups were compared. Categorical variables were described with absolute numbers, percentages, odds ratios (OR), and the lower (l) and upper (u) limits of the confidence interval (CI); continuous variables were described with mean, standard deviation (SD), standardized mean difference, and CI. Groups were compared with the χ² test (categorical variables) and with the independent t-test or the Mann−Whitney U test (continuous variables). The significance level was set at p = 0.05, or at p = 0.017 with Bonferroni correction. Effect sizes were evaluated with Cohen's d, with r based on the z-statistic, or with f based on η². The interpretation of the effect size is based on Cohen. 23 All analyses were two-sided and conducted with the statistical software package SPSS (version 27) by IBM.
To answer this research question, the following hypotheses were generated and statistically tested as we proceeded.
H1: There is a significant difference between the intervention group and the control group in terms of pain change measured by the NRS.
H2: There is a significant difference between the intervention group and the control group in terms of pain-related impairment in daily living measured by the ODI.
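The corrected significance level and the z-based effect size described above can be illustrated with a short sketch. The number of comparisons is not stated in the text; three is assumed here because 0.05/3 ≈ 0.017 matches the reported threshold. The conversion r = |z|/√N and the benchmarks 0.1, 0.3, and 0.5 are the conventional ones attributed to Cohen; the z value and N in the example are hypothetical.

```python
import math


def bonferroni_alpha(alpha: float, n_comparisons: int) -> float:
    """Per-comparison significance level under Bonferroni correction."""
    return alpha / n_comparisons


def r_from_z(z: float, n: int) -> float:
    """Effect size r = |z| / sqrt(N) for a rank-based test."""
    return abs(z) / math.sqrt(n)


def label_r(r: float) -> str:
    """Conventional benchmarks for r: 0.1 small, 0.3 medium, 0.5 large."""
    if r >= 0.5:
        return "large"
    if r >= 0.3:
        return "medium"
    if r >= 0.1:
        return "small"
    return "negligible"


# Assumption: three comparisons, which reproduces the reported 0.017
print(round(bonferroni_alpha(0.05, 3), 3))  # 0.017

# Hypothetical z and N, not values from the study
r = r_from_z(-2.0, 25)
print(r, label_r(r))  # 0.4 medium
```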

| Content and description of the medicalmotion app
Medicalmotion GmbH offers a mobile and a web app for users with various kinds of pain and complaints as well as for prevention. In addition to AI-based exercise recommendations, the mobile app also includes relaxation exercises, podcasts, a chat function, and a health cockpit, where users can update and track their condition. The AI-based exercise recommendations consider the data from an anamnestic questioning, including, for example, several pain attributes (such as localization, type, and quality) and lifestyle-related information, as well as daily wellbeing and pain sensation. Furthermore, the user's feedback on previously performed exercises is considered in the recommendation process.

| Sample size
The 58 participants were on average 44.5 (SD 12.3) years old. Most of them had a sitting (44%) or a sitting and standing (44%) working position. Furthermore, 60% did sports one to three times a week.
On average, the members of the intervention group were 41.4 (SD 12.5) years old, and most of them did sports one to three times a week (48%). The most frequent working position was sitting (55%). The intervention group included more female participants (59%). At baseline, the pain development score was 5.9 (SD 2) and the pain impairment in daily life score was 15.2 (SD 8.9).
On average, the participants in the control group were 50.9 (SD 9.3) years old, and most of them did sports one to three times a week (86%). The most frequent working position was sitting and standing (64%). Male and female gender were equally represented in the control group. At baseline, the pain development score in the control group was 6.3 (SD 1.6) and the pain impairment in daily life score was 22.7 (SD 5).
At baseline, the frequency of sporting activity differed significantly between the control group and the intervention group in the following categories: never (p = 0.008) and one to three times a week (p = 0.02). The biggest difference was that 11 people in the intervention group reported not doing any sports, compared with none in the control group. The working position "sitting" also differed significantly (p = 0.04). The mean age was 50.9 years in the control group and 41.4 years in the intervention group (p = 0.02).
At every time point, the control group had significantly higher scores in the ODI questionnaire than the intervention group. Except at baseline, a similar result was seen for pain scores: at 4 weeks and 8 weeks, the control group reported higher scores. Table 1 shows the descriptive statistics at baseline and the values for pain and pain-related impairment in daily living over time, divided by group.

| Pain development
The intervention group showed a change on the NRS from 5.9 (t0) to 4.79 (t1) and 4.5 (t2). This represents a reduction of 1.4 points. The control group showed a change on the NRS from 6.3 (t0) to 6.4 (t1) and 6.4 (t2). This represents an increase of 0.1 points. The between-group comparison showed no significant difference at baseline (p = 0.45) or after 4 weeks (p = 0.02), but a significant difference after 8 weeks (p = 0.003).
There was a statistically significant interaction between time and group, F (2, 82) = 3.791, p = 0.03, partial η² = 0.085. The significant difference after 8 weeks has an effect size of 1 (Cohen's d), which indicates a strong effect. 23 Table 2b presents the statistical data for the ANOVA with repeated measurements.
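As a plausibility check, the reported partial η² can be recovered from the F statistic and its degrees of freedom, and then converted to Cohen's f, the effect size named in the statistical analysis section. The sketch below uses the standard identities η²p = F·df1 / (F·df1 + df2) and f = √(η² / (1 − η²)); the benchmarks of 0.10, 0.25, and 0.40 for f are the conventional ones attributed to Cohen. The function names are hypothetical.

```python
import math


def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F statistic and its df."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def cohens_f(eta_sq: float) -> float:
    """Cohen's f from (partial) eta squared."""
    return math.sqrt(eta_sq / (1.0 - eta_sq))


# Values reported for the time x group interaction: F(2, 82) = 3.791
eta = partial_eta_squared(3.791, 2, 82)
f = cohens_f(eta)
print(round(eta, 3))  # 0.085, matching the reported partial eta squared
print(round(f, 2))    # 0.3, between the 0.25 (medium) and 0.40 (large) benchmarks
```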
As seen in Figure 2, the intervention group had a reduction of 1.4 points on the NRS over 8 weeks. The control group had an increase of 0.1 points on the NRS in the same time frame.
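The change scores above follow directly from the reported group means. The short sketch below reproduces the arithmetic and compares each change with the 2-point threshold for clinical relevance given in the Outcomes section. It works on group means only, which is a simplification: the threshold is usually applied to individual patients, and the text does not report individual-level changes.

```python
CLINICALLY_RELEVANT_NRS_CHANGE = 2.0  # threshold stated in the Outcomes section


def nrs_change(baseline: float, follow_up: float) -> float:
    """Change from baseline, rounded to one decimal (negative = less pain)."""
    return round(follow_up - baseline, 1)


# Group means reported in the text at t0 and t2
intervention = nrs_change(5.9, 4.5)
control = nrs_change(6.3, 6.4)

print(intervention)  # -1.4
print(control)       # 0.1

for name, change in (("intervention", intervention), ("control", control)):
    reaches = abs(change) >= CLINICALLY_RELEVANT_NRS_CHANGE
    print(name, "mean change reaches 2-point threshold:", reaches)
```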

| Pain impairment in daily life
The intervention group showed a change on the ODI from 15.2 (t0) to 13 (t1) and 11.4 (t2). This represents a reduction of 3.8 points. Comparing the scores at each time point shows a statistically significant difference between the groups. Comparing the change scores, however, shows no significant difference between the groups.
Within the intervention group, the factor time seems to have an influence. The difference from baseline to 8 weeks is significant (p = 0.008). The effect size r ranges from small to medium. 23 Table 3 presents the statistical analysis (Wilcoxon and Mann−Whitney U tests) for pain impairment in daily life.
FIGURE 1 Flow chart
As seen in Figure 3, the pain impairment in daily life, described by the mean ODI score, decreased by 3.8 points in the intervention group compared to an increase of 2.3 points in the control group after 8 weeks. The significant reduction after 8 weeks in the intervention group has a medium effect size.
Other authors around Chhabra et al. 27 and Shebib et al. 16  etc.), studies regarding risk assessments should not be ignored. For example, Jain et al. 28 conducted a study on this topic. They found that only 0.00014 adverse events (e.g., increase in pain) were recorded per active day within the app. 28 However, this was not considered negative because the adverse events were associated with the normal changes of LBP patients. 28 The results obtained in the longitudinal cohort study

| CONCLUSION
This study investigated the impact an AI app-based exercise program may have on pain and pain-related impairments compared to so-called rehabilitation exercise in patients with LBP. Significant results were shown for the group comparison after eight weeks as well as the comparison in the intervention group from t0 to t2 in terms of pain reduction and reduction of pain-related impairments in daily living.
These results suggest that a digital health application could be a promising alternative to conventional rehabilitation sport. We therefore recommend that serious consideration be given to the expansion of digital care services in the existing health system, such as the integration of digital solutions in physiotherapeutic care or in the rehabilitation process. In summary, the use of digital solutions can improve patient-reported outcomes and can be an alternative to existing and established solutions in the health care system.

AUTHOR CONTRIBUTIONS
All authors have read and approved the final version of the manuscript. Annika Griefahn had full access to all of the data in this study and takes complete responsibility for the integrity of the data and the accuracy of the data analysis.

ACKNOWLEDGMENTS
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors. Open access funding enabled and organized by Projekt DEAL.

CONFLICTS OF INTEREST
A. G. and F. A. are employees of medicalmotion GmbH. The remaining authors declare no conflict of interest. The access codes were provided by medicalmotion GmbH to the University of Applied Sciences Osnabrück for the purpose of conducting the study.

DATA AVAILABILITY STATEMENT
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

TRANSPARENCY STATEMENT
The lead author Annika Griefahn affirms that this manuscript is an honest, accurate, and transparent account of the study being reported; that no important aspects of the study have been omitted; and that any discrepancies from the study as planned (and, if relevant, registered) have been explained.